safekeeper: send AppendResponse on segment flush #9692
@arssher Just submitting this for an early look to see if you agree with the overall approach. This needs a few tweaks and tests/benchmarks.
5490 tests run: 5247 passed, 0 failed, 243 skipped (full report)
Flaky tests (2): Postgres 17, Postgres 15
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results
e7e2683 at 2024-11-16T15:49:03.309Z
10223d1 to 1cf5b69
Here are some benchmark results to illustrate the need for #9698 before merging this.
Before this change, [...]. With this change, the [...]. Without #9698, this increases the number of control file flushes from 2 to 60 (each with 3 fsyncs on the ingest path), reducing throughput by 30%.
With #9698, only 5 control file flushes happen, and more importantly, these happen off of the ingest hot path. Thus throughput remains unchanged.
So #9698 is a necessary prerequisite to this PR.
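As a back-of-the-envelope check of the figures above, the fsync counts implied by each scenario can be computed directly. This is only an illustrative sketch: the function and constant names are hypothetical, and the only inputs are the numbers quoted in the comment (2, 60, and 5 control file flushes, 3 fsyncs each).

```rust
/// Fsyncs per control file flush, as quoted in the benchmark comment.
const FSYNCS_PER_CONTROL_FILE_FLUSH: u64 = 3;

/// Total fsyncs caused by `flushes` control file flushes.
/// Hypothetical helper, not part of the safekeeper code.
fn control_file_fsyncs(flushes: u64) -> u64 {
    flushes * FSYNCS_PER_CONTROL_FILE_FLUSH
}
```

Under these numbers, the change alone takes the run from 6 to 180 control file fsyncs on the ingest path, while with #9698 the 15 remaining fsyncs happen off the hot path.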
1cf5b69 to 71f04f9
This should be ready for review now. We don't have any tests for [...]. The benchmarks in #9692 (comment) confirm that this results in more frequent commits and no performance regression (assuming #9698 merges first). I'll add a separate benchmark measuring commit latency as part of #9690.
Approach LGTM.
Thanks! I think this should be good for a final review -- anything you think is missing?
4bfdf46 to e7e2683
Problem
When processing pipelined AppendRequests, we explicitly flush the WAL every second and return an AppendResponse. However, the WAL is also implicitly flushed on segment bounds, but this does not result in an AppendResponse. Because of this, concurrent transactions may take up to 1 second to commit, and writes may take up to 1 second before being sent to the pageserver.

Separately, we should consider flushing the WAL on transaction commits -- see #9690.
Resolves #9688.
Summary of changes
Advance flush_lsn when a WAL segment is closed and flushed, and emit an AppendResponse. To accommodate this, track the flush_lsn in addition to the flush_record_lsn.

Note that this will result in more frequent commits during pipelined WAL ingestion, resulting in a control file flush (3 fsyncs) on every segment bound. We should address #9663 first, e.g. by taking control file flushes off of the ingest hot path.
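
The idea can be illustrated with a minimal sketch. It is not the safekeeper implementation: WalStorage, write_record, flush, the tiny SEGMENT_SIZE, and the layout of AppendResponse are all hypothetical names chosen for the example, and the sketch assumes one whole record per write. It only shows how a byte-level flush_lsn and a record-aligned flush_record_lsn can diverge when a segment is closed in the middle of a record, and how a response can be produced at that point instead of waiting for the periodic flush.

```rust
/// Log sequence number: a byte offset into the WAL.
type Lsn = u64;

/// Hypothetical segment size, kept tiny so boundaries are easy to follow.
const SEGMENT_SIZE: u64 = 16;

/// Hypothetical response; the real message carries other fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AppendResponse {
    flush_lsn: Lsn,
    flush_record_lsn: Lsn,
}

#[derive(Debug, Default)]
struct WalStorage {
    write_lsn: Lsn,        // end of all bytes written
    write_record_lsn: Lsn, // end of the last complete record written
    flush_lsn: Lsn,        // end of all bytes known to be durable
    flush_record_lsn: Lsn, // end of the last complete durable record
}

impl WalStorage {
    /// Writes one whole record of `len` bytes. If the write closes a
    /// segment, that segment is treated as flushed and a response is
    /// returned immediately.
    fn write_record(&mut self, len: u64) -> Option<AppendResponse> {
        let prev_record_end = self.write_record_lsn;
        let end = self.write_lsn + len;
        self.write_lsn = end;
        self.write_record_lsn = end;

        // Last segment boundary at or before the new write position.
        let boundary = end - end % SEGMENT_SIZE;
        if boundary <= self.flush_lsn {
            return None; // no segment was closed by this write
        }

        // Bytes up to the boundary are durable. The record that straddles
        // the boundary is not, unless it ends exactly on the boundary.
        self.flush_lsn = boundary;
        let durable_record_end = if end == boundary { end } else { prev_record_end };
        self.flush_record_lsn = self.flush_record_lsn.max(durable_record_end);
        Some(self.response())
    }

    /// Explicit flush (the periodic one): everything written is durable.
    fn flush(&mut self) -> AppendResponse {
        self.flush_lsn = self.write_lsn;
        self.flush_record_lsn = self.write_record_lsn;
        self.response()
    }

    fn response(&self) -> AppendResponse {
        AppendResponse {
            flush_lsn: self.flush_lsn,
            flush_record_lsn: self.flush_record_lsn,
        }
    }
}
```

In this sketch, writing two 10-byte records closes the first 16-byte segment during the second write, so a response is produced with flush_lsn at 16 while flush_record_lsn stays at 10, the end of the last fully durable record.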